Summarization and Matching of Density-Based Clusters in Streaming Environments
Abstract
Density-based cluster mining serves a broad range of applications, from stock trade analysis to moving object monitoring. Although methods for efficiently extracting density-based clusters have been studied in the literature, the problem of summarizing and matching such clusters, with their arbitrary shapes and complex structures, remains unsolved. The goal of our work is therefore to extend the state of the art in density-based cluster mining over streams from cluster extraction alone to the analysis and management of the extracted clusters. Our work addresses three major technical challenges. First, we propose a novel multi-resolution cluster summarization method, called Skeletal Grid Summarization (SGS), which captures the key features of density-based clusters, covering both their external shape and their internal structure. Second, to summarize the extracted clusters in real time, we present an integrated computation strategy, C-SGS, which piggybacks the generation of cluster summarizations on the online clustering process. Lastly, we design a mechanism to efficiently execute cluster matching queries, which identify clusters similar to a given cluster of interest to the analyst among the clusters extracted earlier in the stream history. Our experimental study using real streaming data shows that our proposed methods clearly outperform potential alternatives in both efficiency and effectiveness for cluster summarization and cluster matching queries.
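The abstract does not spell out how SGS is constructed, so the following is only a generic illustration of the idea of a multi-resolution, grid-based cluster summary with a simple matching score. It is not the authors' SGS or C-SGS. The function names (`grid_summary`, `coarsen`, `cell_overlap`), the 2-D point representation, and the use of Jaccard overlap as a similarity measure are all assumptions made for this sketch.

```python
from collections import Counter
from math import floor


def grid_summary(points, cell_size):
    """Summarize 2-D points as a Counter mapping grid cell -> point count.

    Hypothetical helper: a plain uniform grid, not the paper's SGS.
    """
    return Counter(
        (floor(x / cell_size), floor(y / cell_size)) for x, y in points
    )


def coarsen(summary, factor):
    """Merge factor-by-factor blocks of cells into one coarser cell.

    Applying this repeatedly gives summaries at several resolutions.
    """
    coarse = Counter()
    for (cx, cy), count in summary.items():
        coarse[(cx // factor, cy // factor)] += count
    return coarse


def cell_overlap(a, b):
    """Jaccard similarity of the occupied-cell sets of two summaries.

    An assumed stand-in for a matching score; it ignores densities
    and does not align clusters found at different positions.
    """
    cells_a, cells_b = set(a), set(b)
    if not cells_a and not cells_b:
        return 1.0
    return len(cells_a & cells_b) / len(cells_a | cells_b)


# Illustrative data only (not from the paper).
cluster_a = [(0.1, 0.1), (0.2, 0.3), (1.5, 0.2), (2.5, 2.5)]
cluster_b = [(0.5, 0.5), (3.5, 3.5)]

fine_a = grid_summary(cluster_a, cell_size=1.0)
fine_b = grid_summary(cluster_b, cell_size=1.0)
coarse_a = coarsen(fine_a, factor=2)

print(dict(fine_a))
print(dict(coarse_a))
print(cell_overlap(fine_a, fine_b))
```

A summary of this kind stores only occupied cells and their counts, so its size depends on the area a cluster covers rather than on the number of points, which is what makes comparing a query cluster against many archived clusters cheap in this toy setting.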
Journal: PVLDB
Volume 5
Publication date: 2011